1. INTRODUCTION

Predicting the behavior of bio composites is a critical problem in bio composite research, design,
and development. Several studies have modeled bio composites based on natural fillers [1]-[7].
Physical and mathematical models were the standard means of predicting bio composite properties.
However, these models are idealized because they are restricted to unbroken interfacial conditions and
perfect microstructures. Other numerical models describing several physical scenarios of bio composites
have also been studied. Nevertheless, these phenomenological methods usually rely on complicated
analytical expressions that are difficult to solve and computationally demanding. These models also often
depend on empirical parameters obtained from experiments, which limits their use because the parameters
obtained are valid only for the bio composite studied [8]. From a simulation standpoint, the goal of these
models is to facilitate the design process of bio composites. In other words, predictive methods are studied
to help choose the most appropriate bio composite constituents (matrix type, fiber type, architecture,
sizing, and content) so that the resulting bio composite part can carry the anticipated loads in specific
applications.

The importance of predicting and characterizing the behavior of bio composites has led to several
published studies on the subject [9]-[12]. However, these investigations are usually tied to a specific
combination of process parameters, manufacturing process, material constituents, or life conditions. Many
studies focus on optimizing a particular manufacturing process by examining the effects of process
parameters on the behavior of bio composites while keeping matrix type, fiber type, architecture, sizing,
and content constant. Others study the effect of the manufacturing process on the resulting bio composite
[13]. Other research focuses on the effects of life conditions, such as environmental history and loading,
on the behavior of the studied bio composites [14], [15]. Each of these studies helps to understand and
predict a specific bio composite behavior. However, their results are rarely used beyond a specific
application, which limits their wider utility. When modeling a bio composite, researchers generally rely on
their own tests and experience and on predefined methods for selection, property prediction, and materials
classification. Practicing engineers rarely venture outside their comfort zone to try new processes or
constituents, because introducing a new process implies a rigorous and costly trial-and-error effort before
the process reaches maturity and profitability. This expensive experimental trial and error can be avoided
during the modeling of bio composites by using the collective knowledge available in the bio composite
field. The many studies investigating the behavior of these materials could be exploited by machine
learning to select suitable manufacturing processes and constituent materials for any particular bio
composite application.

Unlike physical methods, machine learning (ML) models can be highly efficient for modeling bio
composites because they can handle large, high-dimensional data sets in order to identify the desired
behavior [16], [17]. These intelligent methods are now used to create predictive approaches for
biomaterial modeling. For example, these techniques have been successfully applied to predict some
properties of metals, such as microstructure [18] and plastic behavior [19]. They have also been studied for
atomistic modeling [20], electronic components [21], and predictions of chemical similarity [22]. In this
context, biomaterial scientists want to benefit from understanding and implementing powerful machine
learning methods to characterize or predict the behavior of bio composites. In this study, we present a
systematic methodology to predict the Young's modulus of polypropylene reinforced with horn powder
using supervised ML models.


2. PROPOSED METHOD 

The method proposed for using ML techniques to predict targeted properties of the bio composite
includes three phases: i) data preparation, ii) ML model building, and iii) evaluation. Each phase is
divided into several steps, as described below. The main components of the proposed method are
illustrated schematically in Figure 1.


2.1. First phase: data preparation 

The first phase involves gathering all available data pertinent to the behavior of bio composites. The
objective is to assemble a well-understood database that can be used to build a predictive machine learning
model. As presented in Figure 2, this phase itself consists of three steps: data structuring, collection, and
cleaning.


Figure 1. The steps of predicting bio composites behavior with machine learning (Step 1: data
preparation; Step 2: ML model building; Step 3: model evaluation)

Figure 2. The data preparation steps of the ML methodology (Step 1: data structuring; Step 2: data
collection; Step 3: data cleaning)


2.1.1. Data structuring 

In this step, it is important to decide carefully which process and material variables are relevant to
the modeled bio composite behavior. For instance, all parameters known or suspected to influence the
rigidity of polymer bio composites need to be included at this step to avoid repeating the process later. The
relative significance of these parameters, often referred to as explanatory variables in the ML context, with
respect to the output behavior can usually be calculated once the machine learning model is constructed.
Certainly, natural fiber parameters such as content, type, and architecture have to be considered. Likewise,
it is important to consider the matrix type, the filler and its content, and the processing temperature. The
processes used to produce the polymer bio composites may also influence their behavior and have to be
considered. Furthermore, parameters that reflect life span conditions and other essential variables known to
influence the behavior of bio composites, such as component thickness (or a typical dimension) and void
content, must be considered. Most published articles and data sources will not document all of the
variables considered.

An additional attractive property of many ML methods is the possibility of building models with
missing values in the constructed database. In most instances, each explanatory variable needs to be divided
into categories. For example, natural fiber architecture could be categorized into hybrid, random,
multidirectional, and unidirectional architectures. The type of matrix can be classified into the main known
resins, and so on. This step is crucial because a careful classification of explanatory variables, with
well-defined categories for each of them, is necessary for any learning algorithm to be used successfully.
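
As a minimal sketch of how such a categorization could be represented, the Python fragment below turns one database entry into a fixed-length numeric vector and keeps an explicit flag for every value a source did not report. The record fields, the list of matrix types, and the function names are assumptions made for illustration; only the four fiber architectures come from the text above.

```python
from dataclasses import dataclass
from typing import List, Optional

# Architecture names follow the text; the matrix list is an illustrative assumption.
FIBER_ARCHITECTURES = ["unidirectional", "multidirectional", "random", "hybrid"]
MATRIX_TYPES = ["polypropylene", "polyethylene", "epoxy"]


@dataclass
class Record:
    """One database entry; None marks a value the source did not report."""
    fiber_architecture: Optional[str]
    matrix_type: Optional[str]
    fiber_content_wt_pct: Optional[float]
    processing_temp_c: Optional[float]


def one_hot(value: Optional[str], categories: List[str]) -> List[float]:
    """One-hot vector with an extra trailing slot that flags a missing value."""
    if value is None:
        return [0.0] * len(categories) + [1.0]
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1.0 if value == c else 0.0 for c in categories] + [0.0]


def numeric(value: Optional[float]) -> List[float]:
    """Encode a number as [value, missing_flag]; a missing value becomes [0.0, 1.0]."""
    return [0.0, 1.0] if value is None else [float(value), 0.0]


def encode(record: Record) -> List[float]:
    """Flatten one record into a fixed-length numeric feature vector."""
    return (
        one_hot(record.fiber_architecture, FIBER_ARCHITECTURES)
        + one_hot(record.matrix_type, MATRIX_TYPES)
        + numeric(record.fiber_content_wt_pct)
        + numeric(record.processing_temp_c)
    )


# Made-up entry whose processing temperature was not reported.
example = Record("random", "polypropylene", 20.0, None)
features = encode(example)
```

Keeping the missing-value flags separate from the values lets a learning algorithm distinguish "not reported" from a genuine zero, which matters when most sources document only part of the variables.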


2.1.2. Data collection 

In this step, extensive bio composite data related to the predicted behavior need to be gathered
from scientific publications, industrial data, and technical reports. Of all the steps in the proposed method,
this one is the most tedious and time-consuming. However, it is the building block upon which the model
is constructed. The accuracy of the ML predictive model depends on the diversity and size of the analyzed
database. Consequently, an evolving database could be built progressively. In the process, many published
works focusing on the behavior need to be checked before the data can be retrieved. This step is
ideally performed after data structuring to avoid collecting the data twice.


2.1.3. Data cleaning 

In this step, the accuracy of all retrieved values needs to be evaluated to ensure data quality.
The accuracy of machine learning predictive models can be severely harmed by incorrect data. Such
errors can occur while retrieving data from the literature and while entering the data into the database. All
entered values have to be double-checked to ensure that no incorrect value was included in the database.
In case of doubt, other published research by the same authors has to be consulted to verify the reported
value.
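
Part of this double-checking could be automated. The following sketch flags values that fall outside plausible limits and rows that were entered twice, so that they can be re-checked against the original source. The field names, the limits, and the sample rows are all made up for illustration and do not come from the database used in this work.

```python
# Hypothetical plausibility limits (min, max) per field; real limits would come
# from domain knowledge about the material system being studied.
PLAUSIBLE_RANGES = {
    "fiber_content_wt_pct": (0.0, 100.0),
    "processing_temp_c": (20.0, 400.0),
    "young_modulus_mpa": (1.0, 100000.0),
}


def find_issues(rows):
    """Return a list of (row_index, message) for entries that need a manual re-check."""
    issues = []
    seen = {}
    for i, row in enumerate(rows):
        for field, (low, high) in PLAUSIBLE_RANGES.items():
            value = row.get(field)
            if value is None:
                continue  # missing values are allowed; they are not errors
            if not (low <= value <= high):
                issues.append((i, f"{field}={value} outside [{low}, {high}]"))
        # An exact duplicate usually means the same source was entered twice.
        key = tuple(sorted(row.items()))
        if key in seen:
            issues.append((i, f"duplicate of row {seen[key]}"))
        else:
            seen[key] = i
    return issues


# Made-up rows: the second has an impossible fiber content, the third repeats the first.
rows = [
    {"source": "ref_A", "fiber_content_wt_pct": 20.0, "processing_temp_c": 190.0, "young_modulus_mpa": 1500.0},
    {"source": "ref_B", "fiber_content_wt_pct": 200.0, "processing_temp_c": 190.0, "young_modulus_mpa": 1800.0},
    {"source": "ref_A", "fiber_content_wt_pct": 20.0, "processing_temp_c": 190.0, "young_modulus_mpa": 1500.0},
]
issues = find_issues(rows)
```

Such checks only catch gross errors; a value that is plausible but wrongly transcribed still requires comparison with the source publication.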


2.2. Second phase: building the machine learning model 

The second phase also involves three steps, as presented in Figure 3. First, an efficient machine
learning method needs to be chosen. Next, the training data have to be selected from the collected dataset so
that the chosen ML algorithm can be applied to construct the model. The final step involves constructing a
predictive ML model that relates a set of possible explanatory variables (also called independent variables)
to a particular material property (the dependent or response variable). The prediction performance of the
model can then be evaluated to check the ability of the input variables to explain the variation in the
targeted bio composite behavior.


Figure 3. The model building steps of the ML methodology (Step 1: ML technique selection; Step 2:
training data selection; Step 3: model building)


2.2.1. Machine learning method selection 

A wide variety of sophisticated learning algorithms are available, including neural networks, decision
trees, and Bayesian networks, among others. This step requires a careful choice of the most appropriate
supervised technique for modeling the predicted behavior. For example, when the database contains missing
values, the random forests algorithm could be chosen because it deals well with missing values and mixed
data. Other aspects related to the compiled database also have to be taken into consideration, such as the
handling of nonlinearity and robustness to outliers.


2.2.2. Training data selection

To determine the predictive ability of the chosen ML algorithm, the processed dataset needs to be
separated into two subsets: i) a training dataset containing most of the collected data and ii) the remaining
data, called unseen or validation data. The size of the training dataset has to strike a balance between
training the model well and evaluating its predictive ability reliably. The chosen ML algorithm is then
applied to the training data to construct a predictive model that explains the variations in the analyzed bio
composite behavior based on a large variety of process and material variables (explanatory or independent
variables).
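
A simple way to obtain the two subsets is a seeded random split, sketched below. The 80/20 ratio, the function name, and the stand-in records are assumptions; the text only requires that the training subset hold most of the data.

```python
import random


def split_dataset(records, validation_fraction=0.2, seed=0):
    """Shuffle a copy of the records and return (training, validation)."""
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be between 0 and 1")
    shuffled = list(records)
    # A dedicated, seeded generator makes the split reproducible.
    random.Random(seed).shuffle(shuffled)
    n_validation = max(1, round(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]


records = list(range(50))  # stand-in for 50 database entries
training, validation = split_dataset(records, validation_fraction=0.2, seed=42)
```

Fixing the seed keeps the split identical between runs, so that differences in accuracy can be attributed to the model rather than to a different partition of the data.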


2.2.3. Model building 

In this step, the chosen algorithm is run on the training data. The resulting predictive model
provides a relationship between the bio composite properties and the explanatory variables with a
quantified predictive precision. Several algorithms can also determine the contribution of each independent
variable to explaining the variability of the response in the resulting predictive model.
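
One possible instance of this step is a small feed-forward network with a single hidden layer. The sketch below uses 10 tanh hidden units and one linear output, trained with plain batch gradient descent. This is a simplification chosen for illustration and is not the Levenberg-Marquardt training reported in Table 1. The function name, learning rate, number of epochs, and the synthetic single-input data are all assumptions.

```python
import math
import random


def train_network(xs, ys, hidden=10, epochs=500, lr=0.02, seed=0):
    """Fit y ~ sum_j v[j] * tanh(w[j] * x + b[j]) + c by batch gradient descent.

    Returns the predictor function and the training MSE recorded at every epoch.
    """
    rng = random.Random(seed)
    w = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]
    v = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    c = 0.0
    n = len(xs)

    def predict(x):
        return sum(v[j] * math.tanh(w[j] * x + b[j]) for j in range(hidden)) + c

    history = []
    for _ in range(epochs):
        gw, gb, gv, gc = [0.0] * hidden, [0.0] * hidden, [0.0] * hidden, 0.0
        mse = 0.0
        for x, y in zip(xs, ys):
            h = [math.tanh(w[j] * x + b[j]) for j in range(hidden)]
            err = sum(v[j] * h[j] for j in range(hidden)) + c - y
            mse += err * err / n
            for j in range(hidden):
                dh = err * v[j] * (1.0 - h[j] ** 2)  # back-propagate through tanh
                gv[j] += 2.0 * err * h[j] / n
                gw[j] += 2.0 * dh * x / n
                gb[j] += 2.0 * dh / n
            gc += 2.0 * err / n
        history.append(mse)  # MSE before this epoch's update
        for j in range(hidden):
            w[j] -= lr * gw[j]
            b[j] -= lr * gb[j]
            v[j] -= lr * gv[j]
        c -= lr * gc
    return predict, history


# Synthetic, normalized data: x could stand for a scaled filler fraction and
# y for a scaled modulus; the linear relation is invented for the example.
xs = [i / 10 for i in range(11)]
ys = [0.2 + 0.6 * x for x in xs]
predict, history = train_network(xs, ys)
```

The recorded `history` plays the role of a training curve: a steadily decreasing MSE indicates that the optimizer is fitting the training data, while the held-out data are still needed to judge predictive ability.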


2.3. Third phase: model evaluation

The objective of the third phase is to evaluate the predictive performance of the model. To this end,
the ML model needs to be checked using the held-out data, also referred to as test data. The trained model
has to be used to predict the response for the unseen dataset, and the predicted responses have to be
compared with the actual responses. The accuracy obtained is a good indicator of the quality of the
predictive model of bio composite behavior.
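
The comparison between predicted and actual responses can be quantified with the two indicators used in Section 3.2, the mean squared error (MSE) and the correlation coefficient (R). The sketch below computes both in plain Python; R is taken here to be the Pearson correlation coefficient, which is an assumption, and the sample values are made up.

```python
import math


def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def correlation_coefficient(actual, predicted):
    """Pearson correlation coefficient R between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


actual = [1000.0, 1100.0, 1200.0, 1300.0]     # made-up moduli in MPa
predicted = [1010.0, 1090.0, 1210.0, 1290.0]  # made-up model outputs in MPa
mse = mean_squared_error(actual, predicted)   # 100.0 (MPa squared)
r = correlation_coefficient(actual, predicted)
```

Note that the MSE carries the squared unit of the response, so its magnitude is only meaningful relative to the scale of the predicted property.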

Another important characteristic of most ML methods is their capacity to discern the effect of the
explanatory variables on the predicted behavior. Explanatory variables can therefore be ranked according to
their effect on the targeted bio composite behavior. This is particularly helpful when extending the
database, because it allows the explanatory variables to be collected in the first phase to be redefined.
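
One model-agnostic way to obtain such a ranking is permutation importance: the values of one explanatory variable are shuffled across the records, and the resulting increase in prediction error measures how much the model relies on that variable. This technique is named here as an example and is not necessarily the one used in this work. In the sketch below, `toy_model` is a hypothetical stand-in for a trained model, and the data are random.

```python
import random


def mse(model, rows, targets):
    """Mean squared error of the model over the given rows."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, n_repeats=20, seed=0):
    """Mean increase in MSE when one feature column is shuffled, per feature."""
    rng = random.Random(seed)
    baseline = mse(model, rows, targets)
    importances = []
    for col in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[col] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:col] + [value] + r[col + 1:] for r, value in zip(rows, column)]
            increases.append(mse(model, shuffled, targets) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_model(row):
    """Stand-in model: strong effect of feature 0, weak effect of feature 1, none of feature 2."""
    return 10.0 * row[0] + 1.0 * row[1]


data_rng = random.Random(1)
rows = [[data_rng.random(), data_rng.random(), data_rng.random()] for _ in range(40)]
targets = [toy_model(r) for r in rows]
importances = permutation_importance(toy_model, rows, targets)
ranking = sorted(range(3), key=lambda i: importances[i], reverse=True)
```

A variable whose importance stays near zero, such as feature 2 in this toy case, is a candidate for removal when the list of explanatory variables is revised.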


3. RESULTS AND DISCUSSION 
3.1. Predicting the Young's modulus of polypropylene reinforced by horn fibers with the finite
element method

In order to test the performance of the constructed machine learning model, we used the finite
element method (FEM), which is widely known for its efficiency in determining the properties of complex
structures but requires a long computing time [20]-[25]. The mesh contains 170,152 elements and 290,352
nodes. Figures 4 and 5 show the geometry created and the mesh obtained with the FEM, respectively
[24]-[28].


Figure 4. Geometry created for the bio composite

Figure 5. Mesh obtained for the bio composite


3.2. Predicting the Young's modulus of polypropylene reinforced by horn fibers with the machine
learning model

Using the methodology described above, we predicted the Young's modulus of polypropylene
reinforced by horn fibers. Figure 6 presents the machine learning model. Tables 1 and 2 present the
machine learning algorithm parameters and the training progress. The performance of the neural network
(NN) can be evaluated using two indicators: the correlation coefficient (R) and the mean squared error
(MSE). Figures 7, 8, and 9 present the MSE, the training state, and the correlation coefficient at 6 epochs.
The closer R is to 1 and the smaller the MSE, the better the NN performs. A small MSE (0.827) is obtained
for the predicted Young's modulus of the bio composite (1229.0963 MPa) in comparison with the finite
element method (1229.2 MPa). Good performance is also observed in terms of the correlation coefficient,
with R close to 0.99999 for training and validation and 0.97503 for testing, which validates the
performance of the selected neural network model.
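
To put the agreement between the two predictions in perspective, the relative deviation of the machine learning value from the finite element value can be worked out directly from the two moduli quoted above. The symbols $E_{\mathrm{ML}}$, $E_{\mathrm{FEM}}$, and $\delta$ are introduced here for convenience only.

```latex
\[
\delta = \frac{\lvert E_{\mathrm{ML}} - E_{\mathrm{FEM}} \rvert}{E_{\mathrm{FEM}}}
       = \frac{\lvert 1229.0963 - 1229.2 \rvert}{1229.2}
       = \frac{0.1037}{1229.2}
       \approx 8.4 \times 10^{-5}
       \approx 0.008\,\%
\]
```

The two methods therefore differ by about 0.1 MPa, which is less than one hundredth of a percent of the predicted modulus.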


Figure 6. Machine learning model (hidden layer: 10 neurons; output layer: 1 neuron)


Table 1. Machine learning algorithms parameters
Parameters                 Type
Data division algorithm    Random
Training algorithm         Levenberg-Marquardt
Performance algorithm      Mean squared error
Calculation algorithm      MEX


Table 2. Machine learning progress
Parameters           Values
Iterations           6
Performance          0.679
Gradient             11.2
Mu                   0.1
Validation checks    6


Figure 7. Machine learning model (mean squared error (MSE) versus epochs for the training,
validation, and test sets; 6 epochs)


Figure 8. Machine learning training state (gradient = 11.2273, Mu = 0.1, and validation checks = 6
at epoch 6)


Figure 9. Machine learning correlation coefficient


4. CONCLUSION 

A general methodology for predicting the behavior of bio composites using supervised ML models
has been presented. The machine learning method predicts particular bio composite properties by taking
into account their constituents, manufacturing processes, process parameters, and expected life span. The
methodology, based on applying ML methods to the whole body of knowledge in the bio composite
materials field, involves three phases. First, a database is compiled from technical reports, industrial data,
and research articles. Second, the most appropriate supervised learning method is used to construct a
predictive model that explains the investigated behavior based on selected process and material variables.
Third, the predictive performance of the built model is assessed, and the importance of the explanatory
variables used is evaluated. This approach has the potential to significantly improve the design process for
bio composite materials. It will reduce trial-and-error iterations by refining the initial selection of bio
composite constituents, manufacturing processes, and process parameters for any particular application. In
this paper, we studied the mechanical behavior of polypropylene reinforced by horn fibers. It would be
worthwhile to study the thermomechanical behavior and to perform other tests on this bio composite, such
as aging and fatigue tests. Additional bio fillers of plant or animal origin can also be analyzed for the
reinforcement of this polymer using numerical modeling and experimental tests.